open machine learning course
GitHub - girafe-ai/ml-course: Open Machine Learning course
Warning: the repository has been renamed to reflect its current status. This course aims to introduce students to the modern state of Machine Learning and Artificial Intelligence. It is designed to take one full year, approximately 2 * 15 lectures and seminars. All learning materials are available here; the full list of topics covered in the course is given in program_*.pdf. If you lack any of the prerequisites, you can make up for it with diligence, because the course provides additional materials for studying the requirements yourself.
Open Machine Learning Course. Topic 6. Feature Engineering and Feature Selection
In this course, we have already seen several key machine learning algorithms. However, before moving on to the fancier ones, we'd like to take a small detour and talk about data preparation. The well-known concept of "garbage in -- garbage out" applies 100% to any task in machine learning. Any experienced professional can recall numerous times when a simple model trained on high-quality data proved better than a complicated multi-model ensemble built on data that wasn't clean. This article will contain almost no math, but there will be a fair amount of code. Some examples will use the dataset from the Renthop company, which is used in the Two Sigma Connect: Rental Listing Inquiries Kaggle competition. In this task, you need to predict the popularity of a new rental listing, i.e. classify the listing into three classes: ['low', 'medium', 'high']. To evaluate the solutions, we will use the log loss metric (the smaller, the better).
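To make the log loss metric mentioned above concrete, here is a minimal sketch of how it can be computed for the three classes. The function name `multiclass_log_loss`, the column order in `CLASSES`, the clipping constant `eps`, and the toy labels and probabilities are all assumptions made for illustration; they are not taken from the article or from the competition data.

```python
import numpy as np

# Assumed column order of the predicted probabilities (illustrative only).
CLASSES = ["low", "medium", "high"]


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    # Clip to avoid log(0), then renormalize so every row sums to 1.
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    probs = probs / probs.sum(axis=1, keepdims=True)
    # Map each label to its column index and pick that probability per row.
    idx = np.array([CLASSES.index(label) for label in y_true])
    true_class_probs = probs[np.arange(len(idx)), idx]
    return float(-np.mean(np.log(true_class_probs)))


# Toy data, made up for this example.
y_true = ["low", "high", "medium"]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]

loss = multiclass_log_loss(y_true, probs)
# Baseline: predicting 1/3 for every class gives a loss of ln(3).
uniform = multiclass_log_loss(y_true, [[1 / 3] * 3] * 3)
print(f"model: {loss:.4f}, uniform baseline: {uniform:.4f}")
```

A useful reference point is the uniform baseline: a model that always predicts 1/3 for each class scores ln(3), roughly 1.0986, so any useful model should come in below that value.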
Open Machine Learning Course. Topic 2. Visual data analysis with Python
In the field of Machine Learning, data visualization is not just making fancy graphics for reports; it is used extensively in day-to-day work for all phases of a project. To start with, visual exploration of data is the first thing one tends to do when dealing with a new task. We do preliminary checks and analysis using graphics and tables to summarize the data and leave out the less important details. It is much more convenient for us, humans, to grasp the main points this way than by reading many lines of raw data. It is amazing how much insight can be gained from seemingly simple charts created with available visualization tools. Next, when we analyze the performance of a model or report results, we also often use charts and images.
Open Machine Learning Course. Topic 1. Exploratory data analysis with Pandas
With this article, we, OpenDataScience, launch an open Machine Learning course. We do not aim to develop yet another comprehensive introductory course on machine learning or data analysis (so this is not a substitute for fundamental education or online/offline courses/specializations and books). The purpose of this series of articles is to quickly refresh your knowledge and help you find topics for further advancement. Our approach is similar to that of the authors of the Deep Learning book, which starts off with a review of mathematics and basics of machine learning -- short, concise, and with many references to other resources. The course is designed to balance theory and practice; therefore, each topic is followed by an assignment with a one-week deadline. You can also take part in several Kaggle Inclass competitions held during the course.